ABSTRACT

DXZ4 is an X-linked macrosatellite composed 
of 12-100 tandemly arranged 3-kb repeat units. 
In females, it adopts opposite chromatin ar- 
rangements at the two alleles in response to 
X-chromosome inactivation. In males and on the 
active X chromosome, it is packaged into hetero- 
chromatin, but on the inactive X chromosome (Xi), 
it adopts a euchromatic conformation bound 
by CTCF. Here we report that the ubiquitous tran- 
scription factor YY1 associates with the euchro- 
matic form of DXZ4 on the Xi. The binding of YY1 
close to CTCF is reminiscent of that at other 
epigenetically regulated sequences, including sites 
of genomic imprinting, and at the X-inactivation 
centre, suggesting a common mode of action in 
this arrangement. As with CTCF, binding of YY1 to 
DXZ4 in vitro is not blocked by CpG methylation, 
yet in vivo both proteins are restricted to the 
hypomethylated form. In several male carcinoma 
cell lines, DXZ4 can adopt a Xi-like conformation 
in response to cellular transformation, charac- 
terized by CpG hypomethylation and binding of 
YY1 and CTCF. Analysis of a male melanoma cell 
line and normal skin cells from the same individual 
confirmed that a transition in chromatin state 
occurred in response to transformation. 

INTRODUCTION 

Now that most of the human genome sequence has been
assembled (1), one of several pressing issues is to determine
the role of the various DNA elements in establishing and
maintaining the gene-expression profiles that underlie
different cell types. Because only a relatively small portion
of the genome actually codes for protein (2), about half
is composed of repetitive DNA (1), and a significant
fraction is transcribed into non-coding RNA (3), many
challenges lie ahead. Intriguing aspects of the human
genome are the extensive variation among individuals in
copy number (copy-number variation, CNV) (4) and the
consequences associated with such diversity.

Macrosatellites are an extreme form of CNV. They 
are composed of individual repeat units, typically > 1 kb 
in size, that are arranged in tandem, often spanning 
hundreds of kilobases (5). Some are chromosome 
specific, such as ZAV at chromosome 9q32 (6) and 
DXZ4 at chromosome Xq23 (7,8), whereas others are 
found on at least two chromosomes, such as D4Z4 on 
chromosomes 4q35 and 10q26 (9,10) or RS447 on 
chromosomes 4p15 and 18p23 (11). The macrosatellite
about which most is known is D4Z4, because of its 
association with onset of facioscapulohumeral muscular 
dystrophy (FSHD) (12). FSHD, the third most common 
inherited form of muscular dystrophy (OMIM 158900), is
primarily manifested as progressive muscle atrophy of the 
face, shoulders and upper arms and is often accompanied 
by gradual spread of symptoms to the lower body (13). 
D4Z4 is a tandem array of as many as 100 3.3-kb repeat
units. Contraction of the macrosatellite to fewer than 11
repeat units on a permissive chromosomal haplotype (14) 
is associated with disease onset (15). Recent data implicate 
an altered chromatin structure for the contracted array 
(16-18) and stabilization of transcripts originating from 
the distal edge of the macrosatellite as the molecular 
basis of the disease (19). Collectively, these data highlight
the importance of macrosatellite CNV for disease
susceptibility. 

Primary DNA-sequence conservation of the X-linked 
macrosatellite DXZ4 is restricted to higher primates
(20). In humans it is composed of as few as 12 to over
100 3-kb GC-rich repeat units located at Xq23 (7,8). Like
many other macrosatellites (6,21,22), DXZ4 is expressed 
(8,23), but unlike that of D4Z4 (24) its transcript contains 
no obvious conserved open reading frames. X-linkage 
exposes DXZ4 to the process of X-chromosome inactiva- 
tion (XCI), a form of dosage compensation employed by 
mammals to balance the levels of X-linked gene expression 
in the two sexes (25). Early in female development, one 
of the two X chromosomes is chosen to become the future 
inactive X chromosome (Xi) (26). Soon thereafter 
most gene expression from the Xi is shut down (27), and 
silencing is stably maintained as the chromosome is 
repackaged into facultative heterochromatin (28). 
Although most of the X chromosome adopts this new 
chromatin structure, DXZ4 does not conform and 
instead adopts a more euchromatic organization charac- 
terized by CpG hypomethylation (7,23), dimethylation of 
histone H3 at lysine residue 4 (H3K4me2), acetylation 
of histone H3 at lysine 9 (H3K9Ac), and association 
with the epigenetic organizer protein CCCTC-binding 
factor (CTCF) (23). In contrast, DXZ4 on the active 
X chromosome (Xa) and male X is organized into con- 
stitutive heterochromatin characterized by CpG 
hypermethylation (7,23), trimethylation of histone H3 at 
lysine 9 (H3K9me3) (23) and association with heterochromatin
protein 1 gamma (HP1γ) (18). Intriguingly, DXZ4
shows several parallels with the mouse X-inactivation 
centre (Xic), a region of the X chromosome required 
for XCI (26). These include differential CpG methylation 
(29,30), transcription of non-coding RNAs (26), a tandem 
repeat sequence (30) and association with Ctcf (31). In 
addition, the ubiquitous zinc-finger protein Yin-Yang 1
(YY1) associates with the Xic (32).

Here, we report work showing that YY1 associated
with DXZ4 specifically on the Xi alongside CTCF.
Although in vitro CpG methylation was unable to block
YY1 binding, YY1 and CTCF were restricted in vivo to
the hypomethylated DXZ4 allele on the Xi. Furthermore,
we report changes to DXZ4 chromatin on the male X as a 
result of cell transformation. 



MATERIALS AND METHODS 

Antibodies 

Commercial primary antibodies used in this study were
obtained from the following sources: mouse anti-CTCF
(612149), BD Biosciences; rabbit anti-CTCF (07-729),
rabbit anti-H3K4me2 (07-030) and rabbit anti-H3K9me3
(07-523), Millipore; rabbit anti-GAPDH (sc-25778), goat
anti-YY2 (sc-47637) and rabbit anti-YY1 (sc-1703), Santa
Cruz Biotechnology; rabbit anti-YY1 (AB1, AV38301),
Sigma-Aldrich. Commercial secondary antibodies used in
this study, Alexa Fluor-488 goat anti-rabbit (A11001) and
Alexa Fluor-555 goat anti-mouse (A21422), were obtained
from Invitrogen Corporation.

Cells and cell lines 

Human telomerase-immortalized female retinal pigment
epithelia (hTERT-RPE1), female mammary epithelia
(hTERT-HME1) and male foreskin fibroblasts
(hTERT-BJ1) were all originally obtained from
Clontech. All are now available from the American
Type Culture Collection (ATCC). Primary male skin
fibroblast cells CCD1139Sk (CRL-2708) and Malme-3
(HTB-102) were obtained from ATCC. The following
cell lines were all obtained from ATCC: female cervical
adenocarcinoma cell line HeLa (CCL-2); Malme-3M
(HTB-64), derived from a male malignant melanoma;
male hepatocellular carcinoma cell line HepG2
(HB-8065); male fibrosarcoma cell line HT-1080
(CCL-121); male colorectal adenocarcinoma cell lines
SW620 (CCL-227), SW480 (CCL-228), HCT-15
(CCL-225), DLD-1 (CCL-221), Caco-2 (HTB-37),
HCT116 (CCL-247) and SW1116 (CCL-233). All cells
were maintained according to the supplier's
recommendations. Tissue-culture media and supplements
were purchased from Invitrogen Corporation.

Immunofluorescence and fluorescence in situ hybridization 

Cells were grown directly on glass microscope slides
before indirect immunofluorescence. They were washed
with 1x phosphate-buffered saline (PBS) before fixation
and extraction in fixative for 10 min (1x PBS, 0.1% Triton
X-100, 3.7% formaldehyde). Cells were washed twice with
1x PBS before being blocked for 30 min (3% bovine
serum albumin, BSA; 1x PBS; 0.1% Tween-20) and
were then washed three times for 2 min each in 1x PBS
before incubation for 60 min with the primary antibody
(1:100 dilution of primary antibody in 1x PBS, 0.1%
Tween-20, 1% BSA). Cells were washed as above before
incubation with the secondary antibody (1:100 dilution
of secondary antibody in 1x PBS, 0.1% Tween-20,
1% BSA), then washed once more and fixed as
above before two washes of 2 min in 1x PBS and
addition of anti-fade-containing
4',6-diamidino-2-phenylindole (DAPI). All steps were
performed at room temperature.

Indirect immunofluorescence combined with fluorescence
in situ hybridization (FISH) was performed as
above with the following additional steps before application
of DAPI. Cells were dehydrated for 3 min each in
70% and 100% ethanol before being air-dried. Dried
cells were denatured in 70% formamide, 2x saline-sodium
citrate buffer (2x SSC; pH 7.0) for 10 min at
83°C, before dehydration in ice-cold 70% and 100%
ethanol for 3 min each and air-drying. A Spectrum
Orange direct-labeled DXZ4 probe was prepared by nick
translation in the presence of Spectrum Orange dUTP
according to the manufacturer's instructions (Abbott
Molecular). The DXZ4 probe was resuspended in
Hybrisol VII (MP Biomedicals) and denatured at 78°C
for 5 min before being placed on ice. The probe was
applied to the cells and sealed under a cover-slip with
rubber cement and incubated overnight at 37°C in a
humidified chamber. The following day, the cells were
washed twice for 8 min each in 50% formamide, 2x SSC
at 42°C, then washed once for 8 min in 2x SSC at 42°C,
before being mounted in DAPI.

Imaging was performed on a DeltaVision pDV. The
images were deconvolved with softWoRx 3.7.0 (Applied
Precision) and compiled with Adobe Photoshop CS2
(Adobe Systems).

Chromatin immunoprecipitation 

Cells were fixed in culture media for 10 min at room
temperature by addition of formaldehyde to 1% final
concentration. Cross-linking was quenched for 5 min by addition
of glycine to 125 mM final concentration. Cells were
washed and collected with ice-cold 1x PBS supplemented
with 0.1 mg/ml phenylmethylsulfonyl fluoride. Cells were
resuspended at 7 x 10~^ cells/ml in lysis buffer (1% SDS,
10 mM EDTA, 50 mM Tris, pH 8.0, containing 2 ng/ml
Aprotinin, 2 µg/ml Leupeptin, 1 µg/ml Pepstatin and
0.1 mg/ml phenylmethylsulfonyl fluoride). Chromatin
was sheared with a Bioruptor (Diagenode) set at high
power, 30 s on, 30 s off. Sonication was performed with
0.2 ml of lysate per 1.7-ml tube for 12 cycles; the bath was
cooled with ice every fourth cycle. Lysate was precleared
with protein-A agarose beads, before addition of primary
antibody and incubation overnight at 4°C. Immune
complexes were collected the next day with protein-A
agarose and washed at 4°C twice for 5 min each with
low wash buffer (0.1% SDS, 1% Triton X-100, 2 mM
EDTA, 150 mM NaCl, 20 mM Tris, pH 8.0), once for
5 min with high wash buffer (0.1% SDS, 1% Triton
X-100, 500 mM NaCl, 20 mM Tris, pH 8.0) and once for
5 min with TE buffer (10 mM Tris, pH 8.0, 1 mM
EDTA). Protease inhibitors were used at the same
concentration as in the lysis buffer. Chromatin was eluted at
room temperature with 100 mM NaHCO3, 1% SDS and
cross-links were reversed overnight at 65°C after addition of
NaCl to 0.2 M. Residual RNA was removed for 30 min
at 37°C with RNase A, then protein by a 120-min
incubation at 42°C with proteinase K. DNA was purified with
the QIAquick PCR purification kit (Qiagen). Protein-A
agarose, RNase A, proteinase K and all protease
inhibitors were obtained from Roche Applied Science.

DNA immunoprecipitated with either CTCF or YY1
was assessed by PCR with DXZ4-F23
(GGACAGTCCCAAGCCACTC) and DXZ4-R26
(AGATGCTGATCCGCCATGTG). DNA immunoprecipitated with
either H3K4me2 or H3K9me3 was assessed by PCR with
DXZ4-F19 (GAGATGCCCATGAACTCAAG) and
DXZ4-R19 (GCCAGGGGGATAGGTGTG). All
oligonucleotides we used were obtained from Eurofins MWG
Operon.
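
As an illustration only, the ChIP primer pair for CTCF and YY1 listed above can be checked in silico with a few lines of Python. This is a minimal sketch and not part of the published protocol: the helper names (`reverse_complement`, `gc_fraction`, `amplicon_length`) and the short toy template are assumptions made for the example, and the real DXZ4 monomer sequence is not reproduced here.

```python
from typing import Optional

# ChIP primers for CTCF/YY1 as listed in the text (5'->3').
DXZ4_F23 = "GGACAGTCCCAAGCCACTC"
DXZ4_R26 = "AGATGCTGATCCGCCATGTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def amplicon_length(template: str, forward: str, reverse: str) -> Optional[int]:
    """Length of the product expected on the top strand of `template`.

    Exact-match search only: the forward primer must occur as written and
    the reverse primer as its reverse complement, downstream of the
    forward primer. Returns None if either primer is not found.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    rev_site = reverse_complement(reverse)
    end = template.find(rev_site, start)
    if end == -1:
        return None
    return end + len(rev_site) - start


# Hypothetical toy template (NOT the DXZ4 sequence): primer sites
# separated by a 10-nt filler and flanked by 3 nt on each side.
toy_template = "TTT" + DXZ4_F23 + "ACGTACGTAC" + reverse_complement(DXZ4_R26) + "GGG"

print(reverse_complement(DXZ4_R26))
print(round(gc_fraction(DXZ4_F23), 3), round(gc_fraction(DXZ4_R26), 3))
print(amplicon_length(toy_template, DXZ4_F23, DXZ4_R26))
```

Run against a real DXZ4 monomer sequence, the same `amplicon_length` call would give the predicted product size for any of the primer pairs above, provided the primers match the template exactly.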

Electrophoresis mobility shift assays 

Whole-cell extracts were prepared from HeLa S3 cells as
described previously (33), with the exception that
phosphatase inhibitors were not included. Oligonucleotides
were generated to the following DNA sequences along
with oligonucleotides to the complementary strand:
DXZ4-YY1 (CGCCCCGCACATGGCGGATCAG),
DXZ4-YY1-Mut (CGCCCCGCATTGGGCGGATCAG),
DXZ4-YY1L (GGAAAAAACGCCAACAGCGCCCCGCACATGGCGGATCAG).
We labeled the double-stranded oligomers radioactively with
T4-polynucleotide kinase (New England Biolabs) in the
buffer designated by the manufacturer and (γ-32P) ATP
(Perkin Elmer-Cetus). HeLa whole-cell extracts were
incubated with the 32P-labeled double-stranded oligomers
on ice for 30 min in binding buffer (10 mM Tris, pH 7.5,
50 mM NaCl, 1 mM DTT, 5% glycerol), in the presence of
1 µg dIdC and 1 µg of non-specific double-stranded DNA
oligomers. The protein-DNA complexes were then
separated on 4% polyacrylamide gels and fixed for
15 min (10% acetic acid, 10% methanol) before drying.
Dried gels were exposed to a phosphoimager screen and
scanned with a Typhoon 9410 Imager (GE Healthcare,
Piscataway, NJ, USA). We assayed competition by
adding cold oligomers and performed supershifts by
adding specific anti-YY1, anti-YY2, or non-specific
antibody (anti-GAPDH) to the binding reactions, as
indicated. Using double-stranded DXZ4-YY1L as a
template, we methylated the C's of CpG dinucleotides in
the sequence with M.SssI according to the manufacturer's
recommendations (New England Biolabs). Methylation
was assayed with the methyl-sensitive restriction
endonuclease HinP1I (New England Biolabs). Both methylated
and non-methylated DXZ4-YY1L oligomers were then used in
mobility-shift assays as described above. Oligonucleotides
modified with 5-methyl cytosine were synthesized directly
with the same sequence as DXZ4-YY1 above; only the
three C's in the CpG context were methylated for the
forward and reverse sequence (Eurofins MWG Operon).
Nuclear extracts from HeLa cells stably overexpressing
Flag-epitope tagged YY1 (33) and purification of
bacterially expressed non-tagged YY1 protein (34) have been
described previously.
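
The features of the EMSA oligonucleotides that matter for the methylation experiments (CpG positions, the HinP1I recognition sequence GCGC, and the bases changed in the mutant) can be located with a short script. The sketch below is illustrative rather than a description of how the oligomers were designed; the function names are assumptions, and positions are 0-based indices on the strand as written.

```python
# EMSA oligonucleotides as listed in the text (top strand, 5'->3').
DXZ4_YY1 = "CGCCCCGCACATGGCGGATCAG"
DXZ4_YY1_MUT = "CGCCCCGCATTGGGCGGATCAG"
DXZ4_YY1L = "GGAAAAAACGCCAACAGCGCCCCGCACATGGCGGATCAG"

HINP1I_SITE = "GCGC"  # recognition sequence given in the text


def find_all(seq: str, motif: str) -> list:
    """0-based start positions of every (possibly overlapping) match."""
    seq, motif = seq.upper(), motif.upper()
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif]


def cpg_positions(seq: str) -> list:
    """0-based positions of the C in each CpG dinucleotide."""
    return find_all(seq, "CG")


def mismatches(a: str, b: str) -> list:
    """0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a.upper(), b.upper())) if x != y]


print("YY1L length:", len(DXZ4_YY1L))
print("CpGs in YY1L:", cpg_positions(DXZ4_YY1L))
print("CpGs in YY1:", cpg_positions(DXZ4_YY1))
print("HinP1I sites in YY1L:", find_all(DXZ4_YY1L, HINP1I_SITE))
print("Mutated positions:", mismatches(DXZ4_YY1, DXZ4_YY1_MUT))
```

On the sequences as printed here, the 22-bp DXZ4-YY1 oligomer forms the 3' end of the 39-bp DXZ4-YY1L, and the positions changed in the mutant coincide with the CAT core.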

Bisulfite analysis 

Genomic DNA was isolated from cells with the Qiagen
Blood and Cell Culture DNA Midi Kit (Qiagen, Valencia,
CA, USA). Bisulfite modification was performed with the
Qiagen EpiTect Bisulfite Kit according to the
manufacturer's instructions. We performed PCR using the primers
DXZ4-Bis-F1 (CCAAACAAACTACCCAAAACC) and
DXZ4-Bis-R2 (GAAGGTAGGTTAGTAAGAAGG),
which amplify a 535-bp fragment of modified DXZ4.
PCR products were cloned into pDrive with the Qiagen
PCR cloning kit. Plasmid DNA was isolated from clones
and DNA sequence was supplied by the sequencing
services of Eurofins MWG Operon.
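
The way clone sequences translate into a percentage of CpG methylation can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis pipeline used in the study: clones are assumed to be already aligned to the unconverted reference on the same strand with no insertions or deletions, bisulfite conversion is assumed to be complete, and the 22-bp DXZ4-YY1 sequence stands in for the 535-bp amplicon. All function names are hypothetical.

```python
def cpg_positions(seq: str) -> list:
    """0-based positions of the C in each CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def bisulfite_convert(seq: str, methylated=frozenset()) -> str:
    """Simulate a sequenced clone after bisulfite treatment and PCR.

    Every C is read as T unless its 0-based index is in `methylated`
    (5-methylcytosine is protected from conversion).
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def percent_cpg_methylation(reference: str, clones: list) -> float:
    """Percentage of CpG sites, pooled over all clones, that read as C."""
    sites = cpg_positions(reference)
    calls = [clone[i].upper() == "C" for clone in clones for i in sites]
    if not calls:
        raise ValueError("no CpG sites or no clones to score")
    return 100.0 * sum(calls) / len(calls)


# Stand-in reference: the 22-bp DXZ4-YY1 sequence given in the text.
reference = "CGCCCCGCACATGGCGGATCAG"

# Three simulated clones: fully methylated, unmethylated, one CpG methylated.
clones = [
    bisulfite_convert(reference, methylated={0, 5, 14}),
    bisulfite_convert(reference),
    bisulfite_convert(reference, methylated={0}),
]

print(clones[0])
print(round(percent_cpg_methylation(reference, clones), 1))
```

Here 4 of the 9 scored CpG calls read as C, so the pooled value is about 44.4%.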



RESULTS 

The Xi is the largest mass of facultative heterochromatin
in a female nucleus; at interphase it is typically located at
the nuclear periphery in a structure called the Barr body
(35). Because many chromatin proteins and
euchromatin-associated histone modifications are underrepresented in
or absent from the Barr body, the territory of the Xi
appears as a hole in the distribution pattern obtained
when immunofluorescence is performed with antibodies
to any of these chromatin features (36). Because DXZ4
is packaged into euchromatin and bound by CTCF on the
Xi (23), immunofluorescence for either CTCF or
H3K4me2 readily detects an intense focus within the
territory of the Xi, a pattern that is shared with several other
DXZ4-associated proteins and chromatin modifications
(36).

Figure 1. Examples of indirect immunofluorescence performed with anti-YY1 (H-414, sc-1703), showing the distribution of YY1 (green) and CTCF
(red) in female hTERT-RPE1 nuclei on the inactive X chromosome. The 4',6-diamidino-2-phenylindole (DAPI) images are shown in black and white,
which better emphasizes the densely staining Barr body. The white arrowhead points to the intense CTCF and YY1 signal within the territory of the
Barr body. The bottom panel is a third example zoomed in to show more clearly that YY1 overlaps with the CTCF signal.

YY1 associates with DXZ4 on the inactive X chromosome

In our indirect immunofluorescence assays, YY1 showed a
general nuclear staining pattern with an obvious
underrepresentation at the Xi except at a single intense
focus (Figure 1). Consistent with previous data (23,36),
CTCF showed a general nuclear distribution and a dot
within the Barr body. Interestingly, the Xi-associated
YY1 and CTCF signals overlapped.



The combination of YY1 immunofluorescence with
FISH with a direct-labeled DXZ4 probe indicated that
the Xi-associated YY1 signal overlapped with DXZ4
(Figure 2A), but examination of the DXZ4 signal
originating from the Xa showed little to no overlap with
YY1, suggesting that YY1 association with DXZ4 is Xi
specific.

To validate the immunofluorescence and FISH data
and to confirm that YY1 is actually binding to DXZ4
and not simply near it, we performed chromatin
immunoprecipitation (ChIP) using anti-YY1 on several
independent male and female cell cultures. When the
immunoprecipitated DNA was assessed by PCR with
primer sets covering subregions of a single 3-kb DXZ4
monomer (data not shown), only one primer set,
encompassing a 186-bp fragment of DXZ4, consistently
generated a product for the YY1 ChIP, a result that was
only observed for the female samples. Figure 2B shows
examples for two independent male samples (left panels)
and two female samples (right panels). YY1 is therefore
associated with DXZ4 on the Xi only.

Figure 2. YY1 associates with DXZ4 on the Xi only. (A) Examples of
female hTERT-RPE1 nuclei showing the distribution of YY1 by
indirect immunofluorescence (green) combined with direct DNA
fluorescence in situ hybridization with a probe for DXZ4 (red).
DAPI-stained nuclei are shown in black and white, which better
emphasizes the Barr body (left column). The small white arrow indicates
the location of DXZ4 on the active X chromosome. The white
arrowhead indicates DXZ4 on the inactive X chromosome. The bottom sets
of panels are a close-up of an image. Overlapping signals appear
orange. (B) YY1 ChIP output assessed for DXZ4 association by
PCR. Two male samples are shown on the left (top, hTERT-BJ1;
bottom, CCD1139Sk) and two female samples on the right (top,
hTERT-HME1; bottom, hTERT-RPE1). Samples assessed include the
input (IN), eluted immunoprecipitation (IP) and an
immunoprecipitation with non-specific rabbit serum (RS). PCR results shown
use DXZ4-F23 and DXZ4-R26. Immunofluorescence and ChIP were
performed with anti-YY1 (H-414, sc-1703). These results show that
YY1 associates with DXZ4 on the inactive X chromosome only.

YY1 binds to a histone alpha-like consensus sequence in
DXZ4

When the 3-kb DNA sequence of a single DXZ4 repeating
unit was examined for putative YY1-binding sites, a
match was found with the 7-bp alpha element (37), a
YY1-binding motif located in the coding
region-activating sequence of the H2A.2 and H3.2 histone
genes (38) (Figure 3A). Importantly, this sequence motif
resides within the 186-bp sequence that is positive for YY1
ChIP. Alignment of the alpha element match with the
DNA sequence of 60 different DXZ4 monomers from
nine independent sources (8) revealed that the motif is
100% conserved (data not shown).

When double-stranded DNA oligomers were generated
for the 22-bp DXZ4 sequence shown in Figure 3A and
labeled with 32P, electrophoresis mobility shift assays
(EMSA) revealed a clear shift in mobility in the presence
of whole-cell extract. The shift could be specifically
reduced in the presence of excess unlabeled oligomer but
not when binding competition was performed with
oligomers of which the core CAT sequence had been mutated
(39) (Figure 3B, left four lanes). Binding in the presence of
one of two independent anti-YY1 antibodies, but not with
a non-specific antibody, resulted in a supershift of the
labeled DNA, confirming that YY1 in the cell extract is
binding to the predicted site in DXZ4 (Figure 3B, right
three lanes).

In humans, a retrotransposed homolog of YY1 termed
YY2 resides at Xp22.1-22.2 (40). It shares >95%
amino-acid identity across the zinc-finger DNA-binding
domain of YY1 (40) and recognizes similar DNA
sequence motifs (41). Using a YY2 antibody, we did not
see a supershift for the DXZ4 YY1 target site (Figure 3C).
This result suggests that the shift that we do see is
generated by YY1. Further data supporting YY1 binding to
DXZ4 in vitro are shown in Figure 3B, where an
antibody raised to a portion of the amino-terminus of
YY1 that is less well conserved in YY2 (AB1) shows a
supershift comparable to that for antibodies raised to
full-length YY1.

Binding of YY1 to its target sequence in DXZ4 is not
affected by CpG methylation in vitro

Collectively these data support Xi-specific association of
YY1 with DXZ4, at a site <80 nt from the Xi-specific
CTCF-binding site (23). Like YY1, CTCF is a
multi-zinc-finger DNA-binding protein with important
roles in epigenetic regulation of gene expression and
chromatin organization (42). At a number of genomic sites,
including imprinted regions, binding of CTCF to DNA is
impaired by CpG methylation (43-46). In contrast, CTCF
binding to its DXZ4 target site in vitro is not affected by
CpG methylation (23), perhaps because the CTCF site in
DXZ4 does not directly contain any CpG residues.
Binding of YY1 to DNA is reported to be impaired by
CpG methylation at some sites (47) but not at others (48).
The YY1-binding site in DXZ4 contains one CpG
dinucleotide within the central 7-bp motif and several
other CpG sites nearby (Figure 3D). Previously, we have
determined that these CpG sequences are largely
methylated both in males and on the Xa, whereas the
same sites are unmethylated on the Xi (23)
(Supplementary Figure S1). We therefore sought to
determine whether methylation of the YY1-binding site was
sufficient to block binding and hence to restrict YY1 to
the Xi.

A double-stranded 39-bp sequence encompassing the
YY1-binding site and three upstream CpG dinucleotides
was generated as the target for methylation-sensitive
EMSA (Figure 3D). One CpG dinucleotide resides
within a recognition sequence (GCGC) for HinP1I, a
restriction endonuclease unable to digest target sites when
the central CpG is methylated. Using this feature as a
measure of complete methylation, we methylated the
CpG sites in the 39-bp oligomer in vitro using M.SssI.
Methylated and unmethylated oligomers were then
subjected to HinP1I digestion, which confirmed that the
oligomer was fully methylated (Figure 3E). The EMSA
experiment was then repeated with either the methylated
or unmethylated target sequence. As can be seen in Figure
3F, methylation of the target site appears to have had no
effect on YY1 binding. As with CTCF, which binds the
methylated form of its DXZ4-binding site in vitro (23),
more than just DNA methylation is therefore required to
prevent YY1 from binding to DXZ4 on the Xa in vivo.

The clear blocking of HinP1I digestion of the
methylated double-stranded oligonucleotide (Figure 3E)
does not confirm efficient methylation of the other CpG
dinucleotides in the DNA fragment. To complement this
analysis, we therefore synthesized a sense and anti-sense
oligonucleotide, corresponding to the DXZ4 sequence
shown in Figure 3A, in which the three Cs in the CpG
context were methylated. Side-by-side analysis of the
unmethylated and methylated double-stranded
oligonucleotides revealed no impact on the ability of
endogenous YY1, Flag-tagged YY1, or bacterially purified YY1 to
bind to the sequence (Figure 3G). These data confirm not
only that methylation of CpG does not block YY1
binding in vitro but also that YY1 is responsible for the
EMSA shift, as demonstrated by the shift corresponding to the
Flag-tagged YY1 and a shift with purified YY1.

Figure 3. YY1 electrophoresis mobility-shift assays (EMSA). (A)
Alignment of the candidate YY1-binding sequence of the macrosatellite
DXZ4 (top) with two other well characterized YY1-binding consensus
sequences. The core CAT element is highlighted. Below the alignments
is the synthesized mutant (Mut.) sequence that no longer contains the
CAT motif that was used in the EMSA. (B) Phosphoimager output of a
dried EMSA gel. Radioactive bands appear dark. Samples loaded in
each lane are labeled across the top and include specific competition
(SC) and non-specific competition (NSC). Supershift analysis is shown
in the last three lanes to the right with two independent YY1 antibodies
and a non-specific control. The locations of the shift (S) and supershift
(SS) are indicated by the arrowheads to the right of the gel. Free probe
(FP) is indicated at the bottom of the gel. (C) Supershift analysis with
anti-YY1 and anti-YY2. (D) Extended DNA sequence flanking the
YY1 core CAT element (highlighted in black). CpG dinucleotides
are highlighted by the dots below the sequence, and the location of
the HinP1I site is highlighted in gray (GCGC). (E) Agarose gel (4.0%)
analysis showing results of HinP1I digestions of the DNA sequence
shown in (D), each of which was first treated or not with M.SssI,
which methylates CpG dinucleotides in vitro. Presence or absence of
either enzyme in the procedure is indicated above the corresponding
lanes of the gel by the plus or minus symbol. DNA size is indicated to
the right by a 10-bp ladder. (F) Phosphoimager output of a dried
EMSA gel with the DNA sequence shown in (D). Radioactive bands
appear dark. Target DNA sequence that has not (left two samples) or
has been (right two samples) treated with M.SssI before analysis is
indicated above the gel by plus and minus symbols. Inclusion of
whole-cell extract is indicated below. The location of the shift is
indicated by the small black right-facing arrows. (G) Methylation
insensitivity and YY1 specificity EMSA. EMSA shows shift with HeLa
nuclear extract (Extract), a molecular weight shift for
Flag-epitope-tagged YY1 from nuclear extract from HeLa cells overexpressing
YY1-Flag (Extract Flag-YY1), and shift with purified non-tagged
YY1 from bacteria (Purified YY1). Analyses are shown side by side,
non-methylated double-stranded oligonucleotides to the left and
CpG-methylated ones to the right. The arrows to the right of the gel
indicate the Flag shift (top) and non-tagged shift (bottom).

YY1 and CTCF bind to the hypomethylated form of
DXZ4 on the active X chromosome in some male
carcinomas

Changes in DNA methylation are a common feature in
cancer (49), affecting both unique and repetitive
components of the genome. Indeed, in various cancers, increases
and decreases in DNA methylation have been reported
for large tandem-repeat DNA such as D4Z4 and NBL-2
(50-52). To date, changes in CpG methylation at DXZ4 in
cancer have not been investigated. Although CpG
methylation does not block binding of YY1 (Figure 3F and G)
or CTCF (23) to their DXZ4 target sites in vitro, both
proteins are restricted in vivo to the hypomethylated
macrosatellite on the Xi. Previously, we and others have
shown that most DXZ4 CpG dinucleotides are methylated
in males (7,23), so we investigated a number of male
carcinoma cell lines for a reduction in DXZ4 methylation
by bisulfite sequencing with the premise that any such
reduction in methylation might result in altered accessibility
of YY1 and CTCF to DXZ4.

Across the interval encompassing the bidirectional
promoter and binding sites for YY1 and CTCF, three
independent colon carcinoma cell lines (HCT-15, Caco-2
and SW1116) had CpG methylation profiles
indistinguishable from those of normal males (23) (Figure 4A)
(Supplementary Figure S1). Notably, two CpG residues
frequently appear resistant to methylation (CpG-11 and
less frequently CpG-30 from the left) in normal human
males (23) (Supplementary Figure S1), as well as in
pig-tailed macaque DXZ4 (20). CpG-11 resides
approximately midway between the YY1 and CTCF-binding
sites and CpG-30 within the defined promoter. The
relevance of these largely methylation-resistant CpGs is not
obvious but is intriguing.

A fourth colon-carcinoma cell line (DLD-1) also shows
a typical male CpG methylation profile at DXZ4 (Figure
4B) and more importantly has a profile similar to that of
HCT-15 (Figure 4A); the two cell lines are believed to be
derived independently from the same cancer specimen
(53,54), so any differences in methylation profiles are
unlikely to result from cell-line derivation. In contrast, a
fifth colon-carcinoma cell line (HCT116) and a
fibrosarcoma cell line (HT-1080) show lower methylation, with only
16.2 and 52.1% CpG methylation, respectively
(Figure 4B).

To investigate changes to chromatin organization at
DXZ4, we performed ChIP on HCT116 and HT-1080
along with DLD-1 (Figure 4C). As in male primary cells
(23), the heterochromatin marker H3K9me3 was detected
in all three samples. The euchromatin marker H3K4me2
could readily be detected at DXZ4 in HCT116 and
HT-1080, confirming a euchromatic organization of the
macrosatellite consistent with the CpG hypomethylation
profile (Figure 4B). Low levels of H3K4me2 could also
be detected in DLD-1, possibly because DXZ4 is
expressed at varying levels in males (8,23) and this signal
might therefore represent transcriptionally active DXZ4
monomers.

Transformation-associated chromatin changes at DXZ4 
allow CTCF and YYl to bind DXZ4 on the active X 
chromosome 

Next we sought to determine what changes, if any, occur 
to the methylation profile and chromatin organization of 
DXZ4 in association with metastasis and transformation. 
First we determined and compared the DXZ4 CpG 
methylation profile of SW480, estabhslied from a 
primary colon adenocarcinoma, with that of SW620, 
derived from a metastatic lesion at a later stage of 
colon carcinoma in the same individual (55). In this 
instance the methylation profiles were very similar; 
overall CpG methylation was slightly higher in SW620 
(Figure 4D), suggesting no dramatic alteration in DXZ4 
chromatin organization. Then we examined the CpG 
methylation profile of DXZ4 in skin primary fibroblasts 
(Malme-3) and a cell fine derived from a mahgnant 
melanoma from the same individual (Malme-3M). 
DXZ4 CpG methylation for Malme-3 was comparable 
to that seen in other normal male fibroblasts 
(Supplementary Figure SI) (79.4%), whereas a reduction 
in methylation was noticeable in the melanoma cell fine 
(64.1%) (Figure 4E). ChlP, intended to explore possible 
chromatin changes associated with the reduced DNA 
methylation at DXZ4, once again revealed low levels of 
the euchromatic marker H3K4me2 in the primary male 
fibroblast cells (Figure 4F). This result was not too 
surprising given that >20% of CpG residues in the 
interval are unmethylated (Figure 4E) and that low 
levels of H3K4me2 could be detected in DLD-1, a 
male sample with higher overall CpG methylation 
(Figure 4C). However, when CTCF was used to mark a shift 
from the male X/Xa arrangement toward a female 
Xi-like organization, CTCF could only be detected in 
the melanoma cell line (Figure 4F), consistent with our 
observations for carcinoma cell lines HT-1080 and 
HCT116 (Figure 4C). Furthermore, we detected binding 
of YY1 to DXZ4 in Malme-3M but not in Malme-3 
(Figure 4F). The low YY1 signal in Malme-3M may 
arise because the YY1-binding region retains overall 
higher methylation than is seen in HCT116 and 
HT-1080 (compare Figure 4B with E), suggesting that 
CpG methylation affects YY1 binding to DXZ4 
in vivo. Importantly, this result shows that transformation 
was associated with a reduction in CpG methylation 
and a gain of CTCF and YY1 binding at DXZ4 in 
this individual. 
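
The percentage CpG methylation values quoted for each sample (for example, 79.4% for Malme-3 versus 64.1% for Malme-3M) summarize bisulfite-sequenced clones. As an illustration only, and not a description of the pipeline actually used, such a value could be tallied as sketched below. The single-letter encoding, the function name `percent_methylation`, and the choice to exclude sequence-variant sites from the denominator are all assumptions made for this example; the toy clones are invented and are not data from this study.

```python
from typing import Iterable


def percent_methylation(clones: Iterable[str]) -> float:
    """Percentage of methylated CpGs across bisulfite-sequenced clones.

    Hypothetical encoding, one character per consensus CpG site:
      'M' = methylated, 'U' = unmethylated,
      '-' = sequence variant (non-C at the consensus CpG).
    Assumption: variant sites are excluded from the denominator.
    """
    methylated = 0
    informative = 0
    for clone in clones:
        for site in clone:
            if site == "M":
                methylated += 1
                informative += 1
            elif site == "U":
                informative += 1
            elif site != "-":
                raise ValueError(f"unexpected site code: {site!r}")
    if informative == 0:
        raise ValueError("no informative CpG sites")
    return 100.0 * methylated / informative


# Invented toy clones (4 CpG sites each), for illustration only.
example_clones = ["MMUM", "MU-M", "UUMM"]
example_percent = percent_methylation(example_clones)
```

With real data, each string would hold one character per CpG in the assessed interval (36 per clone in Figure 4), and one value would be computed per cell line.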
Figure 4. DXZ4 CpG methylation and chromatin structure in male carcinomas and transformation-associated changes. (A) Schematic representation 
(top) of a single 3-kb DXZ4 monomer shown right to left on the basis of the orientation on Xq. The region assessed for CpG methylation is 
highlighted, and the coordinates for the interval amplified with the forward and reverse primers given above the monomer. Region of interest 
(bottom) expanded to show the location of the promoter along with the CTCF and YY1-binding sites. The line below it shows the location of all 
Predicting DXZ4 CpG methylation profile from 
euchromatic chromatin signatures and CTCF association 
with the macrosatellite 

Our data support a model in which DXZ4 on the Xa and 
male X is packaged into constitutive heterochromatin 
characterized by H3K9me3 and CpG methylation. In 
contrast, DXZ4 on the Xi is largely packaged into eu- 
chromatin characterized by H3K4me2 and CpG 
hypomethylation and is bound by YY1 and CTCF. 
Transformation-associated reduction in DNA methylation 
at the macrosatellite, however, alters the chromatin 
organization sufficiently to permit binding of YY1 and 
CTCF to male DXZ4, shifting it into a more Xi-like 
state. On the basis of this model, we would predict that 
observation of YY1 or CTCF at DXZ4 in male cells 
would indicate an overall hypomethylated CpG state. To 
test this hypothesis, we examined the chromatin profile for 
CTCF at DXZ4 using the publicly available ENCODE data 
from ChIP experiments combined with high-throughput 
sequencing (56). 

Consistent with our observations (23), DXZ4 in normal 
female but not male cells is characterized by the presence 
of euchromatic markers H3K4me2 and histone H3 
lysine-9 acetylation (H3K9Ac) and binding of CTCF 
(Figure 5A), but in the male hepatocellular carcinoma 
cell line HepG2, DXZ4 chromatin organization is indis- 
tinguishable from the female profiles. Analysis of the CpG 
methylation status of DXZ4 in HepG2 cells reveals a 
hypomethylated state comparable to that of HCT116 
and HT-1080 (Figure 5B), supporting transformation- 
associated changes to DXZ4 chromatin organization on 
the male X chromosome. 



DISCUSSION 

DXZ4 is an enigmatic DNA element that adopts a chro- 
matin organization different from that of the flanking 
chromosome: largely heterochromatin on the Xa and eu- 
chromatin bound by CTCF on the Xi (23). Here, we 
report that the zinc-finger protein YY1 associates specifically 
with the euchromatic form of DXZ4 alongside 
CTCF on the Xi. Colocalization of YY1 and CTCF has 
been reported at binding sites within tandem repeat DNA 
at autosomal imprinted loci (57) and the Xic (32). The 



theme that unifies these three situations is that all are 
classic examples of epigenetics — maintenance of two 
alleles in alternate chromatin states — and support a 
model in which chromatin marked by YY1 and CTCF 
in close proximity has an important role in establishing 
and/or maintaining an alternate chromatin organization 
on one chromosome. 

In vitro analysis showed that both YY1 and CTCF (23) 
can bind methylated DXZ4 target sequences, yet in vivo 
neither is detected at such sequences. Therefore, methylation 
alone cannot account for the allelic exclusion of either 
protein. One explanation could be that in vivo methylated 
DXZ4 DNA is inaccessible. Given that the DXZ4 
macrosatellite on the Xa and in males is packaged in constitutive 
heterochromatin characterized by H3K9me3 (23) 
and HP1γ (18), this supposition is not unreasonable. 
Indeed, binding of CTCF and YY1 does closely follow 
the methylation profile of DXZ4, as demonstrated by 
the binding of both proteins to hypomethylated DXZ4 
in two independent male carcinoma cell lines (HCT116 
and HT-1080) (Figure 4C), and a malignant melanoma 
cell line, but not in primary skin cells from the same 
male (Figure 4F), and CTCF binding to the 
hypomethylated macrosatellite in HepG2 cells (Figure 
5A). Interestingly, these independent male cancer-cell 
lines showed a shift of DXZ4 chromatin organization 
from a typical Xa state toward one that more closely 
resembled that on the Xi in females. Conceivably, this 
arrangement is the default for DXZ4, and in these 
samples it is reverting toward this base state. If so, 
we would predict that DXZ4 in pluripotent cells would 
exist in a euchromatic arrangement, a hypothesis we are 
actively pursuing. 

Our in vitro data indicate that YY1 binds to DXZ4, 
but given that YY2 recognizes similar DNA sequence 
motifs (41) and possesses a near-identical DNA-binding 
domain (40), YY2 may also be associating in vivo with 
DXZ4. Because YY1 and YY2 are not functionally redundant 
and can mediate antagonistic effects at target sites 
(58,59), possible binding of YY2 at DXZ4 warrants 
further investigation. 

YY1 has also been shown to associate with the chromosome 
4q macrosatellite D4Z4 (60), extending the parallels 
between this disease-associated macrosatellite and DXZ4 



Figure 4. Continued 

CpG dinucleotides, represented by the circles on sticks. Below the schematic map are the methylation profiles for three different male colon 
carcinoma cell lines. Each horizontal row of 36 circles represents the sequencing result of an independent TA clone from cloning of PCR 
products generated from bisulfite-modified DNA template for the corresponding cell lines. Methylated CpGs are represented by filled circles, 
unmethylated ones by open circles. Sequence variations at DXZ4 that result in a non-C at a consensus CpG site are represented by dashes. CpG 
dinucleotides contained in the YY1-binding site and promoter are indicated above each profile. The upward-facing arrowheads indicate the 
commonly hypomethylated CpG residues CpG-11 and CpG-30. The percentage methylation is given in brackets beside each name. (B) 
Representation as in (A) showing CpG methylation profiles for two other male colon-carcinoma cell lines and one male fibrosarcoma cell line. 
(C) ChIP analysis of DXZ4 chromatin for the male cell line shown immediately to the left in (B). Data are shown for H3K4me2, H3K9me3, CTCF, 
and YY1 as indicated to the right of each agarose gel image and are representative examples from three independent ChIP replicates for each cell 
line. Samples assessed include the input (IN), the eluted immunoprecipitation (IP), and an immunoprecipitation with non-specific rabbit serum (RS). 
YY1 and CTCF were assessed with DXZ4-F23 and DXZ4-R26, whereas H3K4me2 and H3K4me3 were assessed with DXZ4-F19 and DXZ4-R19. 
ChIP was performed with anti-YY1 (sc-1703). (D) Representation as in (A) showing CpG methylation profiles for a cell line derived from a primary 
colon adenocarcinoma (SW480) and a metastatic lesion at a later stage of colon carcinoma in the same individual (SW620). (E) Representation as in 
(A) showing CpG methylation profiles from a male normal skin fibroblast culture (top) and a malignant melanoma cell line from the same individual 
(bottom panel). (F) Analysis as in (C) showing H3K4me2, H3K9me3, CTCF and YY1 ChIP data for the samples shown immediately on the left in 
(E). Data are representative examples from two independent replicate ChIP experiments. 
Figure 5. Correlation between CpG methylation and chromatin organization of DXZ4. (A) The ChIP-Seq profile for part of the DXZ4 
macrosatellite [ENCODE histone modifications by Broad Institute ChIP-Seq (64)]. The locations of individual 3-kb monomers are marked by left-facing 
arrows. The boxed shaded area in each arrow indicates the region containing the promoter and binding sites for CTCF and YY1. Below are the 
ChIP-Seq profiles for CTCF, H3K4me2, and H3K9Ac as indicated to the right. Profiles include the male hepatocellular carcinoma cell line HepG2; 
two independent normal female lines [top, lymphoblastoid cell line (GM12878); bottom, human mammary epithelial cells (HMEC)]; and two 
independent normal males [top, human skeletal-muscle myoblasts (HSMM); bottom, human umbilical-vein endothelial cells (HUVEC)]. The 
image was taken from the UCSC genome browser (http://genome.ucsc.edu) and shows data taken with default settings from genome build hg18, 
showing a 10-kb window; coordinates X:114 881 001-114 891 000. The data shown in all tracks are the signal view and used the default scale of 0-50 
as indicated to the left. (B) Schematic representation (top) of a single 3-kb DXZ4 monomer shown as a left-facing arrow. The region assessed for 
CpG methylation is indicated by the shaded region and the coordinates given. Methylation profile (bottom) of this interval for HepG2. Methylated 
CpGs are represented by filled circles, unmethylated ones by open circles. Sequence variations at DXZ4 that result in a non-C at a consensus CpG 
site are represented by dashes. The percentage methylation is given to the left of the profile. 



(61), but in this instance, YY1 is associated, as part of a 
repressor complex (60), with the normal heterochromatic 
form of D4Z4, and CTCF only associates upon contraction 
in FSHD (17) when D4Z4 reverts toward a more 
euchromatic conformation (18). Whether YY1 remains 



associated with D4Z4 in its euchromatic contracted form 
remains to be determined. 

Collectively, these data provide novel insight into the 
organization and stability of chromatin at the 
macrosatellite DXZ4. Further parallels can be drawn 
between DXZ4 and D4Z4, as well as monoallelic chromatin 
states at sites of genomic imprinting and the Xic. 
The ability of DXZ4 to adopt two distinct chromatin 
states in the context of XCI (7,20,23), in the absence of 
the macrosatellite contraction gain of function seen for 
D4Z4 in FSHD (17,18), strongly suggests that DXZ4 
fulfils, through epigenetic regulation, two alternate 
conserved (20) functions on the X chromosome — one 
packaged as heterochromatin on the Xa and the other in 
a euchromatic conformation of the Xi. Although only the 
XIC located at Xq13 is required for XCI in humans (62), 
DXZ4 may still have some role after XCI in spreading the 
XCI signal or facilitating organization of the Xi. Recent 
data from Jeon and Lee (63) indicate that Yy1 possesses 
both DNA- and RNA-binding activities and physically 
tethers Xist RNA to the Xi. This exciting result is a 
major step forward in deciphering how Xist associates exclusively 
with the Xi. Xist spreads in cis along the length 
of the Xi (28), so numerous other Xist entry sites defined 
by Xi-specific Yy1 might facilitate this process. The 
Xi-specific association of YY1 at DXZ4 could function 
to assist in the spread of XIST along the length of the 
chromosome. 

Macrosatellites remain a perplexing and largely unexplored 
aspect of our genome (5). We anticipate that future 
studies will reveal that these DNA elements occupy 
an important niche in genome regulation and disease 
susceptibility. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Figure S1, Supplementary Reference [23]. 